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Abstract 

The spatial random-effects model is flexible in modeling spatial covariance func¬ 
tions, and is computationally efficient for spatial prediction via fixed rank kriging. 
However, the success of this model depends on an appropriate set of basis functions. 
In this research, we propose a class of basis functions extracted from thin-plate splines. 
These functions are ordered in terms of their degrees of smoothness with a higher-order 
function corresponding to larger-scale features and a lower-order one corresponding to 
smaller-scale details, leading to a parsimonious representation for a nonstationary spa¬ 
tial covariance function. Consequently, only a small to moderate number of functions 
are needed in a spatial random-effects model. The proposed class of basis functions 
has several advantages over commonly used ones. First, we do not need to concern 
about the allocation of the basis functions, but simply select the total number of func¬ 
tions corresponding to a resolution. Second, only a small number of basis functions is 
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usually required, which facilitates computation. Third, estimation variability of model 
parameters can be considerably reduced, and hence more precise covariance function 
estimates can be obtained. Fourth, the proposed basis functions depend only on the 
data locations but not the measurements taken at those locations, and are applicable 
regardless of whether the data locations are sparse or irregularly spaced. In addi¬ 
tion, we derive a simple close-form expression for the maximum likelihood estimates of 
model parameters in the spatial random-effects model. Some numerical examples are 
provided to demonstrate the effectiveness of the proposed method. 

Keywords: Fixed rank kriging, nonstationary spatial covariance function, smoothing 
splines, thin-plate splines. 


1 Introduction 

Consider a sequence of independent spatial processes, {y{s,t) : s G D}; t = 1,...,T, 
defined on a d-dimensional spatial domain D C The processes are assumed to have 
mean and a common spatial covariance function s*) = cov{y{s,t), y{s*,t)), for 

t = 1,... ,T. Suppose that we observe data Zt = ... ,z{Sn,t))'; t = 1,... ,T, at n 

distinct locations, Si,..., Sn E D, with additive white noise £t according to 


zt = yt + £t-, t = 


( 1 ) 


where yt = (|/(si, f),..., |/(s„, t))', £t iV(0, cr'^In) is uncorrelated with yt, and £ts are 
mutually uncorrelated. The goal is to estimate C(-, ■) and predict y{-,t); t = 1,... ,T, based 
on Zi,Z t without imposing a stationary assumption or a parametric structure. 

We consider the spatial random-effects model (e.g., Cressie and Johannesson| 2008 Wikle 
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2010 Lemos and Sanso, 2012): 


y{s, t) = /i(s, t) + w'J{s) + ^(s, t) 


K 


+ ^Wk{t)fk{s) +^{s,t)] s e D,t = l,...,T, 


( 2 ) 


k=l 


where /fc(-)’s are pre-specified basis functions with K < n, f{s) = (/i(s),...,/i^(s))', 
, WKif))' ~ A^(0, M)] t = 1,..., T, are random effects, and ^(s, t) ~ iV(0, (t|) 
is a white-noise process. Here Wts and ^(s,t)’s are mutually uncorrelated. This model 
is flexible for modeling stationary or nonstationary spatial covariance functions and can 


produce fast prediction (e.g., Wikle, 2010). The spatial covariance function is 


C(s, s*) = cov(?/(s, t),y{s\t)) = f{s)'Mf{s*) + ap{s = s*); s, s* G B. 


(3) 


Given {/i(-),...,/x(-)}, the model @ depends only on the parameters Af and (j|. 
Many approaches have been proposed to estimate these parameters, including a method of 


moments (Cressie and Johannesson, 2008) and maximum likelihood (Katzfuss and Cressie 


2009). Commonly used basis functions include radial basis functions (e.g., Cressie and 


Johannesson, 2008 and Nychka et ah, 2015), discrete kernel basis functions (e.g., Barry et al. 


1996 and Wikle, 2010), and wavelets (e.g., Nychka et ah, 2002j and Shi and Cressie, 2007). 
Although wavelet basis functions are advantageous to have multi-resolution features, they are 
mainly restricted for data observed on a regular grid with no (or few) missing observations. 
In general, different basis functions work well under different situations. However, how to 
select and allocate the basis functions (e.g., centers and radii) is an art and has rarely been 
discussed in the literature. 

In what follows, we provide some examples showing how estimation of AT and (t|, and 
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thus C'(-.-), is affected by the choice of the following bisquare (radial) basis functions: 

fk{s) = (^1 ~ ^ - ^k\\ < Tk), (4) 

which is centered at and has a local bounded support {s G : ||s — hfc|| < r^} controlled 
by a radius r^, for A; = 1,..., 


Example 1 Assume that the underlying covariance function is given by the spatial random- 
effects model of ([^ with D = [0,1], = 6, M = diag(17,14,11, 8, 5,2), cr| = 0, and fl^\-) ’s 

given by (|^ (see Figure^ (o-V); where = 0.2(fc — 1); fc = 1,..., 6 and ri = ■ ■ ■ = ke = 0.5. 
Then the spatial covariance function is C^^\s,s') = (Figure 0 (<‘V). 

where /<“)(«) = (/!“’(«). ■■■./e”!*))'’ 

To mimic a situation in practice, instead of approximating in Example 0 using 

/*“>(■). we consider a different set of bisque basis functions, = {f[^\s),... ,/g^^(s))', 

formed by bk = 0.11(fc — 1) + 0.06; k = 1,..., 9 and ri = ■ ■ ■ = rg = 0.165 (Figure (bl)). 
Let be the optimal matrix that minimizes the integrated squared error ISE(/^^\fVf) 

over all non-negative definite 9x9 matrix M, where 


ISE(/,M)= [ [ {f{syMfis*)-C^^\s,s*)Ydsds*. 
J D Jd 


(5) 


Then the covariance function that has the smallest ISE based on is C'*^^^(s, s*) = 

/d)(s)'M'(^)/(^)(s*) (Figure (b2)). The approximation can be seen to be poor, because 
hfc’s and rfs are not well chosen, despite that a larger number of basis functions are used 
and the approximation involves no estimation error. 

Now consider another set of bisquare basis functions, /^^^(s) = {f[^\s),...,fg^\s)y 
to approximation C'^°^(-,-), where b^ = 0.18(A: — 1) -|- 0.05; k = 1,...,6 and ri = ■■■ = 
rg = 0.27 (see Figure (d))- Here rfs are determined by 1.5 times the minimal distance 


between h^’s as suggested by Cressie and Johannesson (2008). Similar to C^^^(-, •), the best 
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covariance function based on is (Figure 0(02)). 

Although •) is smoother than •), it produces a larger bias. Clearly, the quality 

of approximation highly depends on the choice of K, b^s and r^’s. 

Instead of selecting b^s and r^’s for the bisquare functions of (|^, we shall propose a new 
class of basis functions, which involves no selection of centers and radii, and are ordered in 
terms of their degrees of smoothness. Figure (dl) shows a class of iC = 6 basis functions 
obtained from our method, which will be introduced in Section The covariance function 
based on this class of functions is shown in Figure (d2). Comparing it to •) and 

■), a significant improvement can be seen even though only 6 functions are used. 

To further investigate the effect of b^s and r^’s in covariance function estimation, we 
consider two additional examples. For the first example, we apply the same basis functions 
of except that ri = • • • = rg = r G [0.25,0.9]. Figure]^ (a) shows how the ISE 

of ([^ varies as a function of r. Not surprisingly, covariance function estimation is highly 
affected by r. For the second example, we consider the same bisque functions of (|^ with 
bk = 0.2{k — 1) + A; k = 1,..., 7 and ri = ■ ■ ■ = r 7 = 0.5, similar to those in Example 

These can be regarded as shifted versions of controlled by a shift parameter A. 

Figure (b) shows the ISE of (|^ with respect to A G [—0.2,0]. While ISE is less affected 
by A than r in the first example, a poorly chosen A can still cause some significant bias in 
covariance function estimation. 

In this research, we propose a class of basis functions extracted from thin-plate splines. 
These functions are ordered in terms of their degrees of smoothness with a higher-order func¬ 
tion corresponding to larger-scale features and a lower-order one corresponding to smaller- 
scale details, leading to a parsimonious representation for a nonstationary spatial covariance 
function. Consequently, only a small to moderate number of functions are needed in a spa¬ 
tial random-effects model. The proposed class of basis functions has several advantages over 
commonly used ones. First, we do not need to concern about the allocation of the basis func¬ 
tions, but simply select the total number of functions corresponding to a resolution. Second, 
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Figure 1: (al) Six basis functions corresponding to (a2) The true spatial covariance 

function; (bl) Nine basis functions corresponding to (b2) Spatial covariance function 

obtained from /^^^(■); (cl) Six basis functions corresponding to /^^^(■); (c2) Spatial covari¬ 
ance function obtained from /^^^(■); (dl) Six basis functions from the proposed method; (d2) 
Spatial covariance function obtained from the six proposed basis functions. 
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Figure 2: (a) ISE values with respect to based on six basis functions of Q; (b) ISE values 
with respect to A with bk = 0.2{k — 1) + A based on seven basis functions of Q. 


only a small number of basis functions is usually required, which facilitates computation. 
Third, estimation variability of model parameters can be considerably reduced, and hence 
more precise covariance function estimates can be obtained. Fourth, the proposed basis func¬ 
tions depend only on the data locations but not the measurements taken at those locations, 
and are applicable regardless of whether the data locations are sparse or irregularly spaced. 

The rest of the article is organized as follows. Section 2 introduces the proposed class of 
basis functions. In Section 3, we apply the proposed basis functions to spatial random-effects 
models, and derive simple close-form expressions for the maximum likelihood estimates of 
the model parameters. Some simulation examples and an application to a daily-temperature 
dataset in Canada are presented in Section 4. 

2 The Proposed Ordered Set of Basis Functions 

The proposed class of basis functions will be developed using thin-plate splines (TPSs). We 
shall first provide some basic knowledge about TPS. Given noisy data Zi,..., Zn observed 
at n distinct control points, Si,..., G a TPS function /(s); s G can be obtained 
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by minimizing 

where s = (xi,..., Xd)'i 

J(S) = 


Y,{Zi-f{sdr-+pj{f), 


( 6 ) 


2 = 1 


E 

i^iH-hh'd—2 


2 ! 




ui\ ■ ■ ■ Udl \dxi^ ■ ■ ■ dx’^/ 


ds > 0, 


( 7 ) 


is a smoothness penalty, and p > 0 is a tuning parameter. It is known that (e.g., Wahba 


and Wendelberger, 1980 Green and Silverman, 1993) for p > 0, the solution of ([^ satisfies 

d 

f{s) = q:'0(s) + /do + fdjXj subject to X'a. = 0, (8) 

i=i 

where Si = {xn, ..., Xid)'] i = 1,... ,n, 


X = 



\ 

1 Xii ■ 

■ Xid 

y^l Xin\_ 

^nd j 


(9) 


and 4>{s) = , </>„(«))' with 


9* s = 


1 

12 

1 




87r 

- 1 , 


s — s,- 


if (i = 1, 
log(||s - Sjll); ifd = 2, 


( 10 ) 


s - Si 


if (i = 3. 


A function /(s) in the form of (1^ is called a natural TPS function. It has been shown that 


e.g., Theorem 7.1 in Green and Silverman, 1993) 


J(/) = 


(11) 



















where $ is the n x n matrix with the (i, j)-th element 

Assume that rank(X) = d + 1. We shall introduce our basis functions from the natural 
TPS function space: 

d 

7 = !/(■) : /(s) = a'0(s) + /So + a G M ^/3 G X'a = o}, (12) 

i=i 


where /3 = {(3q, /3i, ..., (3d)'■ The proposed basis functions form a basis of 7, and are dehned 
as 

1; k = l, 

fkis) = < Xk-i, A: = 2,..., d + 1, (13) 

A^Vi{0(^) - ^'X{X'X)~^xyvk-d-i}; k = d + 2,...,n, 

where x = (1, s')' = ... ,Xd)', Vk is the fc-th column of V, Vdiag(Ai,..., \n)V' is the 

eigen-decomposition of Q^Q with Ai > ■ ■ ■ > A„, and Q = I — X{X'X)~^X'. Note that 
ol'^ol > 0 for all q: 7 ^ 0 with X'ol = 0 (see Section 4 of Micchelli ( |1986 )). Consequently, 
a'Q^Qa > 0 for all a satisfying Qa ^ 0, which implies rank(Q$Q) = rank(Q) = n — d— 1. 
Thus Ai > • • ■ > \n-d-i > 0, and hence fd+ 2 {-), ■ ■ ■, /n(') are well dehned. 

The following theorem gives some important properties of these basis functions with its 
proof given in Appendix. 


Theorem 1 Consider fk{-)’s in (13), in (12), andJ{f ) in ([^, and assume that Tank{X) 
d+l < n. Then 


n 


A:=l 


■ ,/d+i(-)} a basis of {g{-) e7 : J{g) = 0}. 


(Hi) For k = d + 2,... ,n, define 


7k = |d(-) e 7 : ^^(Si)^ = 1, '^g{si)fj{si) = 0; j = 1,..., A; - l|. (14) 

i=l i=l 
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Then aigmin J{g) = fk{-) and J{fk) = 

Remark 1 Let fk = (/fc(si), • • • ,/fc(Sn))'; k 
k*), for k,k* = d + 2,... ,n. 


K-d-i, fork = d + 2,...,n. 

= 1,..., n. Then f^X = 0 and fj^fk* 


I{k = 


Remark 2 The basis functions are given in a decreasing order in terms of their degrees 
of smoothness with 0 = J(/i) = ■■■ = J{fd+i) < J{fd+ 2 ) < ■■■ < J{fn)- addition, 
fk{-) is the smoothest function that is orthogonal to fi{-), ■ ■ ■, fk-i{-), for k = d + 2,... ,n. 
This enables a spatial process to be more parsimoniously represented in the spatial random- 
effects model, particularly when the underlying spatial covariance function is smooth. A 
one-dimensional example of f 2 {-) ,■■■, fsof) with n = 50 and Si = i/50; i = is 

shown in Figure^ 


Remark 3 Another basis of if is the Demmler-Reins eh basis (Demmler and Reinsch, 1975) 
given by 


ihis ),..., hMY = U'iix, ^Nfix, s', 0(s)'Ar)', 

where N is an n x {n — d — 1) matrix such that NN' = Q and N'N = In-d-i, and 
C/diag(ai,..., an)U' is the eigen-decomposition of 


{{X,^Ny{X,^N)) 


0 0 
0 




with tti > ■ ■ ■ > On- While hif ),..., /?.„(■) are orthogonal and satisfy Jfhi) < ■ ■ ■ < J{hn), 
they generally do not have the property of Theorem (Hi). Additionally, they are more 
expensive to compute since ((X, $A/’)'(X, involves O(n^) computations. 


Our method given by (13) requires computing only the hrst K eigenvalue and eigenvector 
pairs of Q^Q without the need to solve the full eigen-decomposition problem. In addition, 
we can compute Q^Q = Q- X'{X'Q) via X = {X'X)-^X' and Q = $ - ($X)X to 
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reduce the computations of Q^Q from O(n^) in terms of direct matrix multiplication to 
0{'n?d). The hrst K eigen-functions and eigenvalues can be efficiently obtained using some 


numerical techniques, such as the QR method and the Lanczos method (see e.g., Golub and 


van der Vorstf 2000 Ordonez et al.|20T4) via an R package such as “bigpca” or “onlinePCA’' 


Both packages are available on Comprehensive R Archive Network (GRAN). 

To know how the proposed basis functions perform in representing •) of Example 1, 

we consider six basis functions /i(-)) • • • > f&{') (see Figure (dl)) derived from our method 
with the controlled points given at Sj = z/50; i = 1,...,50, as in Figure The best 
covariance function that minimizes (|^ is shown in Figure (d2). Clearly, it provides a much 
better approximation to the true spatial covariance function than those in Figure [^(b2) and 
(c2) based on /d)(-) and 

To illustrate how the proposed basis functions provide a multi-resolution covariance func- 
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tion representation, we consider a spatially deformed exponential covariance function: 


C(s,s*) = exp{ -2|(s + 0.5)-^-^ - (s* + 0.5)-^-®|}; s, s* e [0,1] 


(see Figure]^ (a)), which is a nonstationary covariance function constructed by applying a 
deformation transformation (s —)■ (s + 0.5)“^'^) to a stationary covariance function as in 

. We apply our basis functions (see Figure to approximate 
this covariance function, where the controlled points are given at Sj = i/50; i = 1,... ,50. 
The results for three different numbers of basis functions {K = 8,15, 30) are shown in Figure 
1^ (b)-(d), respectively. As you can see, large-scale features can be captured even if K is 
merely 8. On the other hand, hner-resolution details are captured by /fc(-) with larger k 
values. 

The proposed class of basis functions is even more effective in the two-dimensional 
space. Suppose that we would like to approximate an exponential covariance function, 
0(s, s*) = 20exp(—0.4||s —s*||) for s, s* G [0,1]^, using/(s)'Af/(s). We compare between 
a conventional method and our method. For a conventional method, we consider the natural 
TPS functions for /(■) formed by 1, xi, X 2 and 


Sampson and Guttorp (1992 


1 

Stt 


s — 


ji _' 

L + VL + lJ 


log 



ji _ 

L + VL + lJ 


1 < £i, £2 ^ F, 


with their centers regularly location in [0,1]^ for L G {3,5,7,9,11,13}, corresponding to 
a total of {12,28,52,84,124,172} basis functions. We apply our method with the control 
points, {((2ji — l)/36,(2j2 — l)/36) : 1 < ji,j 2 < 18}, regularly located in [0,1]^, and 
consider the same numbers of basis functions for comparison. The performance between 
the conventional basis functions and the proposed basis functions is shown in Table For 
all cases, the proposed basis functions provide much better approximation ability than the 
conventional basis functions. 
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Figure 4: (a) A nonstationary spatial covariance function; (b) covariance function approxi¬ 
mation based on 8 basis functions; (c) covariance function approximation based on 15 basis 
functions; (d) covariance function approximation based on 30 basis functions. 

3 Parameter Estimation 


Consider the spatial random-effects model given by ([^ and ([^. For simplicity, we assume 
that p(s,f) = 0 and is known, since and (j| are confounded together unless some 
additional information is available. Given the basis functions /i(-),..., fK{-), the parameters 
that need to be estimated are M, which has to be non-negative dehnite, and (t| > 0. 
Although the ML estimates Mk and of M and (t| can be computed using the EM 


algorithm (Katzfuss and Cressie, 2009), as shown in the following theorem, a closed-form 
expression for can be derived with its proof given in Appendix. The estimate (j| can 
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Table 1: ISE performance between TPS basis functions and the proposed basis functions for 
various numbers of functions. 


number of basis functions 

TPS 

Proposed 

3^3 

0.09462 

0.01895 

52+3 

0.01505 

0.00301 

72+3 

0.00416 

0.00085 

92+3 

0.00155 

0.00037 

11^+3 

0.00070 

0.00021 

132+3 

0.00037 

0.00015 


be computed using a simple one-dimensional optimization method. 

Theorem 2 Consider the model given by 0 and with /r(s,t) = 0 and known. Then 
the ML estimates of M and cr| are given by 


;i-2 

- 

arg min 



Mk = 

{F^Fk. 


T 


tr(>g) 
ct| + 

- 1/2 


^ r 

^ log {dK,k + (T^ + of) - 

k=l ^ 


dK,kdK,k I , . , 2 , 2^ 

2 , 2 \ + {n-K) log(a^ + aj 

ot + at ' 


= Pk diag(dx,i,..., dK,K)PK (F^Fk)-' , 


where S = '^Ztz'jT, Fk = fk = (/fc(si),...,/fe(s„))'; k = 


t=i 


Pk diag{dK i,..., dK,K)PK ^^e eigen-decomposition of (F^Fk) ^^^Ff-SFx {Ff-Fj 


K) 


.- 1/2 


and dK,k = max {dx^k 




,0); A; = 1,... ,iP. 


In practice, we propose to select K G {d -|- 1,..., K*} for a sufficiently large K* using 
Akaike’s information criterion (AIC, Akaike, 1973, 1974|): 


AlC(iP) 


T log I Sfc I + Ttr (5S^') + + 2 

^ I log {dK,k + oIk + 0 -,^) 


dK,kdK,k 


o: 


i,K 


+ oi 


+ K^ + K + 2, 


where = FkMkF^, + {oIj, + a‘1)In- Then the hnal number of basis functions selected 
by AIC is K = argmin AlC(iP). Plugging in and ^ into the best linear unbiased 

d+l<K<K* 
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predictor of y{s,t), we obtain 




^ ^ D, t — 1,... ,T, (15) 


where S - is the Moore-Penrose inverse of Sand can be efficiently computed by 



j-n ^ K^k diS'S (, |"2 



^k^k • • •; 



( 16 ) 


and 


4 Numeric Examples 

4.1 Simulation 

In the simulation, we considered spatial processes, {|/(s, t) : s E [0,1]^} for t = 1,..., 50, gen¬ 
erated from (|^ with /i(s,t) = 0, /i(s) = cos( 7 r||s —(0,1)'||), / 2 (s) = cos(27r||s —(3/4,1/4)'||), 
and {wi{t),W 2 {t)y ~ A^(0, diag(25,9)), where /i(-) and / 2 (-) are shown in Figurej^ We gen¬ 
erated data, 2 i,..., 250 , according to ([^ with n = 100 and = 3, where Si,..., were 
taken from D = [0,1]^ using simple random sampling. 

We applied the spatial random-effects model of ([^ and (|^ and the ML estimates given 
by Theorem to estimate the underlying spatial covariance function with cr/ = 3 assumed 
known. We considered commonly used bisquare basis functions given in (|^ with six different 
layouts for function centers and radii at two resolutions (see Table |^. We applied the 
proposed basis functions and selected the number of basis functions among K E {3,, 20} 
using AIC. We also considered the exponential covariance model and the true covariance 
function for comparison. All the model parameters were estimated by ML. 

The performance of various methods was compared in terms of the mean-squared-prediction- 
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Table 2: Various layouts for centers of the bisque basis functions. 


Layout 

Coarse Resolution 

Fine Resolution 

K 

Center 

Radius 

Center 

Radius 

1 

{o,ir 

3/2 

11/4,3/4}^ 

3/4 

8 

2 

{1/6,5/6}2 

1 

(0,1/2,1)2 

3/2 

13 

3 

{1/6,5/6}2u(1/2,1/2) 

V2/2 

{0,1/2,1}2 

3/2 

14 

4 

{0,1/2, IF 

3/4 

{1/6,1/2,5/6}2 

1/2 

18 

5 

{1/6,5/6F 

1 

(0,1/3, 2/3,1}2 

1/2 

20 

6 

{1/6,5/6Fu(1/2,1/2) 

V2/2 

(0,1/3, 2/3,1}2 

1/2 

21 


error (MSPE) criterion: 


1 



E{yis,t) -y{s,t)f, 


where y{s,t) is a generic predictor of y{s,t) obtained from simple kriging based on Zt using 
an (estimated) spatial covariance model. The results based on 200 simulation replicates are 
shown in Table Not surprisingly, bisquare basis functions perform well for some cases 
but poorly for others. In contrast, our method performs better than all the other spatial 
covariance estimation methods by having a smaller averaged MSPE value. The hrst and 
the third quantiles for the distribution of the number of basis functions selected by AIC are 
about 10 and 12, indicating that only a small number of basis functions is required. 
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Table 3: Averaged MSPEs for various methods based on 200 simulation replicates. Values 
given in parentheses are the corresponding standard errors. 


True 

Exponential 

Our 


Bisque Basis Functions 





1 

2 

3 

4 5 

6 

0.123 

1.234 

0.646 

0.694 

0.872 

1.063 

0.962 1.013 

1.191 

(0.015) 

(0.017) 

(0.015) 

(0.024) 

(0.018) 

(0.031) 

(0.034) (0.032) 

(0.037) 


4.2 Application to Canadian Temperature Data 


We applied the proposed method to an average daily temperature dataset. The data, avail¬ 
able in the “fda” package on CRAN, consist of average temperatures for each day of the 
year at 35 weather stations in Canada, which are averaged over years 1960 to 1994. They 


have been analyzed by Ramsay and Dalzell (1991) and Silverman and Ramsay (2005) using 


functional data analysis techniques without considering spatial dependence. 

Let z{si,t) be the average daily temperature at location s* and day t, where Sj is given 
with coordinates in latitude and longitude in units of degrees. We considered the spatial 
random-effects model of ([^ and (|^ with n = 35 and T = 365. Since the temporal patterns 


are known to be different at different stations (see e.g., Silverman, 1995), we considered a 


semiparametric model (]Buja et ah, 1989) for with station-specific quadratic effects: 


/i(s, t) = mo{t) -|- m(s) i{s)t + s G D, t = 1,..., 365, (17) 


where and q{-) are unknown smooth functions, and for identihcation purpose. 


35 


35 


35 


we assume 


= 5^£(si) = 5^g(si) = 0. 


2 = 1 


2 = 1 


2=1 


We considered a two-step procedure to fit /x(-, •) with the smoothness parameter selected 


by using Mallow’s Cp ( |Hastie and Tibshirani| 1990). First, we obtained the estimates rhi, ti 
and qi of m(sj), £{si) and q{si) for i = 1,..., 35, and the estimate mo(') of ^o(') using the R 
package “gam” available on CRAN (see Figure]^ (a)). Then we separately applied smoothing 
splines to mds, £j’s and gds and obtained the estimates mj(-), £{■) and g(-) (see Figure |^(b)- 
(d)) with the smoothing parameter selected by generalized cross-validation ([Golub et al. 
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Figure 6: Estimated functions in (17): (a) mo(t); (b) m(s); (c) f'(s); (d) q{s). 


1979). Then we assume that /i(s,t) is known as /i(s,t) = mo(t) + rh{s) + i{s)t + q{s)t^ for 


covariance function estimation. 

We randomly divided the data into two parts with one part consisting of 185 time points 
as the training data, and the other part consisting of 180 time points as the testing data. We 
applied the spatial random-effects model of ([^ and (|^. We assumed that (j| = 0, but 
is unknown, and applied ML with the proposed basis functions to estimate the underlying 
spatial covariance function. We also considered applying the exponential covariance model 
to estimate covariance function with the parameters estimated by ML. 

The performance of the two covariance function estimates is evaluated in terms of the 
Frobenius loss, Lossi? = ||S —StestH and the Kullbeck-Leibler loss, Lossici = ^{tr(S“^5'test) + 
log |S| — log latest I —35}, where S is a generic estimate of S and Stest is the sample covariance 
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matrix based on the testing data. The validation procedure was repeated 100 times. The 
average Lossp and hossi^L based on our method are 10.0 and 4.7 respectively, which are 
much smaller than 177.1 and 25.7 based on the exponential covariance model, which is not 
surprising, because the data are highly nonstationary in space. The mean surfaces /i(-,t) 
and the hnal predicted surfaces of for t = 50,125,200 are shown in Figure 


Appendix 


Proof of Theorem 1 

Direct calculation gives 


n 

(i) We hrst show that E ttkfki-) e J", for any given ai,..., G M. 

k=l 


akfki') = OL'(f){s) + /3'(1, xi,..., Xd)', 

k=l 


where 


q: = K-d-idiag(Ai \ ..., J(ad+2,..., a„)', (18) 

f3 = (ai,..., ad+i)' - (X'X)"A'$K-d-idiag(A7\ ..., \~^^_^){ad+2, • • •, an)', 


and Vn-d-i is the submatrix oiV in (13) consisting of its hrst n — d — 1 columns. By the 


dehnition of V, Q^Q = T4 _rf_idiag(Ai,..., \n-d-i)V^_^_i, and hence 


Vn-d-i = Q$QK-d-idiag(Ai ,..., 


(19) 


This together with (18) and X'Q = 0 implies that X'a = 0. Thus ^^akfki') G X i 

k=l 

proved. 

n 

We remain to show that C < akfk{-) : Ofc G M i. We hrst show that 


k=l 


Vn-d-lV^_^_l — Q. 


( 20 ) 
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Latitude Latitude Latitude 



Longitude Longitude 

(cl) (c2) 

Figure 7: (al) /i(s, 50); (a2) y(s,50); (bl) 125); (b2) y(s, 125); (cl) /i(s, 200); (c2) 
y(s,200). 
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From (19) and X'Q = 0, we have XVn-d-iV^_^_]^ = 0. This and the fact that Vn-d-iV^-d-i 
is idempotent of rank n — d — 1 imply that Vn-d-iV^_d_i is the projection matrix for the 
space orthogonal to the column space of X. That is, Vn-d-iVn-d-i — Q- 

d 

Given any /(s) = a'<p{s) + (3o + (3jXj G X, since X'cx. = 0, we can write 

i=i 


fis) = 0(sy(a - XiX’Xy^X’cx) + (l,xi,.. .,Xdyf3 

= (0(s)', ...,Xd) 


K-d-iK-.-i 0 
0 Id+1 




0 K-d-idiag(Aiy...,Aj^_J 

Id+i -{X'X)-^X'^Vn-d-idmg{X^\ 


X 


X'X)~^X'^Vn-d-lV:_,_, Id+1 


a. 


/3 


diag(Ai,...,A„_rf_i)V„'_rf_i 0 

:x'X)-^X'^Vn-d-iV:_,_,cx + f3 
diag(Ai,..., Xn-d-i)Vj^-d-iO^ 


= (/i(s),---,/n(s)) 


where the second equality follows from (20). Thus /(■) G G m|. This 

k=l 

completes the proof of (i). 

(ii) Clearly, J(/i) = • ■ ■ = J(fd+i) = 0. It suffices to show that J(/) = a'^a. > 0 for 

d 

any /(s) = cx'<p{s) + /So + PjXj G XF with ck 7 ^ 0. Since rank(X) = d + 1, X'a = 0 and 

i=i 


CK 7 ^ 0, it follows that ot^ot > 0 (see Section 4 of Micchelli (1986)). This completes the 
proof of (ii). 

(iii) We shall only prove the result for k = d + 2. Given any g[-) G XF, let g = 
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{g{si),g{sn)y = ^cxg + Xf3g. Then g{-) G Xd+2 if and only if 


{cXg,f3g) G {(q:,/3) : X'cx = 0, X'g = 0, and \\g \\2 = l} 

= {(q:,/3) : X'cx = 0^ g = Qg = and ||flr ||2 = l} 

= {(q:,/3) : OL = Qct, (3 = —{X'X)~^X'^OL, and ||Q$Qq :||2 = l}. (21) 


Therefore, from (0 and 


min J{g) = min{Q:'$Q: : ex G M", cx = Q(x, ||Q$Qq:||2 = 1} 
g(')eJ'd+2 

= m\n.{cx Q^Qcx ; ex G M"", ||Q$Qq :||2 = 1} 

= m.in{cx'VXV'cx : a G M”, ||AV'a ||2 = 1} 

= minja'Aa : a G M"", ||Aa ||2 = 1} = A]”^, (22) 


where A = diag(Ai,..., A„). It follows from (21) and (22) that 


(Ai ^vi, -Ai ^{X'X) = argmin { J( 5 () : g{x) = 0(a;)'Q:+(l, X 2 ,.. • ,Xrf)'/3 G Xd+ 2 ]- 

{a,g) 


This proves (iii) and the proof of Theorem 2 is complete. 


Proof of Theorem 2 Let H = TV(P^TV)^^P^, and R = L^Pr- 

It follows from the dehnition of Pft:diag((ix 4 ,..., dK^K)P'j^ and simple algebra that HSH = 
Rdia.g{dK,i, ■ ■ ■, dK^K)R'. Since rank(TVATPj^) < K, the eigen-decomposition of F^MF'^^ 
can be written as Rdia.g{di,... ,dK)R!, where i? is an n x A matrix with orthonormal 
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columns. Using HF^ = we have 


1 -1 


[FkMF'j, + (a| + al)In} 

= {HRdiag[di,..., dK)R'H + ((t| + a‘^)In} 


-1 




di 


d 


K 


cr| + O', 


cr| + 


Veil+ cr| +0-2 fiii'+ cr| + cr^ 


RH. 


Then twice the negative log-likelihood function of Zi,..., zt is 


£(M,a|) = uTlog27r + log \FkMF'j, + (a| + al)In\ + tr{5(F,,MF;, + (a| + 
= nT log 271 -|- log |.Rdiag(eii,..., dK)R' 

di 


-htr| —-S -— -SHRdia.g( 


ct| + cr^ + aj 

i- K 


\d\ -|- ct| -|- 


(T. 


UK 

2 ’ ’ Jx + 0-| + Cle 


nT log 2vr + ^ log (4 + df + - K) log(cr| al) 


4 + V 


k=l 


tr( Rdiag{dK,ii ■ ■ ■ ■,dK,K)I^Rdda.g{ 


+ T 


RH 


M{S) 


di 


d 


K 


\di + dx + cr| + 4 





K 


> nT log 2vr + ^ log (4 + cr| + 4) f + “ R) log(<^| + + 


k=l 


+ T 


MS) 


1 ^ 
(T^ -I- (t2 ^ 


dK,kdk 


4 4 A:=l 4 + + T 


2 ’ 


(23) 


where the last inequality follows from von Neumann’s trace inequality (von Neumann, 1937) 


and the equality holds if and only ii R = R. So given ct|, , (t|) is minimized at 
such that FkMkF'j^ = i?diag(dx,i(o‘|), • • •, dK,K{cr‘^))R', where dK,k{ck1) = niax(d^_fc —cri¬ 
er^, 0); A; = 1,..., iV. It follow that 


Mk = {F'kFk)-M'kFkMkF'j,Fk{F'kFk)-^ 

= {F'j,FK)-M'j,Rd:mg{dK,i, ..., dK,K)R!FK{F'j,FK)-^ 

= {F'kFk)-^>Mk diag(4,i,..., dK,K)PK{F'KFK)-^/\ 


23 




















Finally, replacing dk in the riglithand side of (23) by for k = 1,K, we obtain the 


desired result for This completes the proof. 
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